ABSTRACT


This thesis examines the feasibility of passively fingerprinting network 
reconnaissance tools. Detecting reconnaissance is a key early indication and warning of 
an adversary’s impending attack or intelligence gathering effort against a network. 
Current network defense tools provide little capability to detect, and much less 
specifically identify, network reconnaissance. This thesis introduces a methodology for 
identifying a network reconnaissance tool’s unique fingerprint. The methodology 
confirmed the utility of previous research on visual fingerprints, produced characteristic 
summary tables, and introduced the application of TCP sequence number analysis to 
reconnaissance tool fingerprinting. We demonstrate the use of these methods to 
fingerprint network reconnaissance tools used in a real-world Cyber Defense Exercise 
scenario. 

I. INTRODUCTION


At the onset of the Vietnam War, an American pilot’s first indication that a 
Surface-to-Air Missile (SAM) targeted him was to physically see the missile racing 
toward him. Mounting losses due to SAMs led the United States to develop a plan to 
incorporate the latest in Electronic Warfare Support (ES) technology into specialized 
fighter aircraft to identify and target the enemy SAM sites. That program, which came to 
be known as “Wild Weasel,” underwent incremental development throughout the war. 
The Wild Weasel aircraft introduced the capability to detect the SAM’s search and track 
radars, their reconnaissance, directly within the strike group [1]. The Wild Weasels were 
successful at detecting, classifying, and rapidly engaging the enemy SAM sites. The 
continued advancement of Wild Weasel and ES technologies throughout the war enabled 
American fighters to know and react to key information about their enemy’s behavior. 
Was the enemy radar conducting a general area search? Was it in tracking mode? Did 
the radar just shift to weapons guidance mode, signaling a missile was about to be fired 
[2]? This kind of information gave American pilots the edge they needed to turn the tide 
in the skies over Vietnam and in every air-ground conflict since. 

A similar progression of search, track, classify, and attack exists in cyberspace. 
We can also develop analogous tools to detect and specifically identify, or fingerprint, the 
tools adversaries use to conduct their reconnaissance leading up to an attack. Just as in 
the example above, this information could enable computer network defenders to react 
more quickly and effectively. Detecting reconnaissance is a key early indication and 
warning of an adversary’s impending attack or intelligence gathering effort against a 
network. Fingerprinting the reconnaissance tools used can provide important information 
to create a more robust defense-in-depth posture, creating the opportunity to learn more 
about an adversary’s behaviors and enable defensive cyber maneuver or counter attack. 
It also provides for informed decision-making about the appropriate response to initiate. 
For instance, one could reason about the appropriateness of a response in terms of jus ad 
bellum or jus in bello. 

This thesis examines the feasibility of passively fingerprinting computer network 
reconnaissance tools. Several common scanning reconnaissance tools are analyzed, 
demonstrating a methodology to identify the unique fingerprint characteristics of the 
tools and describing these fingerprints for each tool. To provide a practical real-world 
example of using these fingerprints, we analyzed data captured from the Naval 
Postgraduate School’s (NPS) participation in the 2008 Cyber Defense Exercise (CDX) to 
uniquely identify a network reconnaissance tool used by the attackers against the NPS 
CDX network. The thesis concludes with recommendations for follow-on research. 

A. OVERVIEW 

This thesis is organized into four main chapters that provide background 
information, describe methods, and demonstrate passively fingerprinting computer 
network reconnaissance tools. This thesis makes extensive use of computer networking 
protocols and computer security. For the reader desiring additional information on 
computer networking, the author recommends [3]. For a thorough introduction to 
computer security, the author recommends [4]. 

The second chapter is further subdivided into two sections. The first describes the 
phases of a typical cyber attack. The second identifies where many current 
reconnaissance detection tools fall short. 

The third chapter describes the author’s experimentation methodology from setup 
to data collection and analysis. 

The fourth chapter details the results of the experiments for each reconnaissance 
tool examined. Detailed fingerprint information is shown in the form of visualization 
illustrations and summary tables. 

The fifth chapter provides a demonstration identifying a network reconnaissance 
tool from data captured by the NPS team in the 2008 Cyber Defense Exercise. The 
captured data was analyzed with the methodology described in this thesis, which resulted 
in the specific identification of a reconnaissance tool used during the Cyber Defense 
Exercise. 

The final chapter summarizes the techniques developed and identifies specific 
areas for further research. 

II. THE STRUGGLE FOR CYBERSPACE DOMINANCE 


This chapter is divided into two main sections. The first defines the phases of a 
typical cyber attack. The second describes some of the shortcomings of current computer 
network defense tools. 

A. ANATOMY OF A CYBER ATTACK 

Many different authors have described the phases of cyber attacks in different 
ways. The specific terminology may be slightly different, but categorizing the phases of 
attack by the physical actions taking place on the network for each phase provides the 
most universal applicability. We adopt the following three-phase description from [4], 
[5], and [6]. 

1. Reconnaissance 

The first phase of a cyber attack is reconnaissance. It can be further sub-divided 
into three incremental stages: casing, scanning, and enumeration. For casing, the attacker 
need only use a web browser to research openly available information. For scanning, the 
attacker begins to use specialized tools to send packets to identify live hosts and discover 
information about ports on those hosts. For enumeration, the attacker uses tools to send 
specifically crafted packets to determine specific services being run on the target 
machines. Throughout the reconnaissance phase, the attacker refrains from breaking into 
the potential target. The goal here is to narrow down the target list and query to gain 
more specific information about those targets. 

a. Casing 

At the very start of attack planning, the attacker identifies and sets the 
scope of the target areas to attack. The terms “casing” and “footprinting” are commonly 
used to describe this stage of reconnaissance. This references an analogy to bank robbers 
“casing” a target bank by taking note of readily observable information about the bank’s 
security posture. Similarly, in cyberspace, the attacker does not yet send active probes 
against the target while casing. Rather, the attacker may use an ordinary web browser to 
research openly available information that can be useful in the subsequent phases of 
planning and attack. This includes news articles, browsing the target’s Web site, phone 
numbers, and Internet registry “Who-is” searches to identify the target’s Internet Protocol 
(IP) address space. These searches may reveal the names and phone numbers of key 
personnel that may be later targeted for phishing schemes, information on the Web site 
developer and hosting company, physical addresses and specific IP address to target or 
the limits of the target’s IP space. 

b. Scanning 

Scanning is active reconnaissance. At this stage, the attacker begins 
overtly sending packets to the target IP address or range of IP addresses (the target’s 
network). The attacker’s goal is to determine what machines, or “hosts,” are present and 
reachable on the target network. These scanning probes are akin to search radar scanning 
the skies, sending out radio energy and listening for the characteristic radio echo return 
from target aircraft. Just as detecting and recognizing the characteristic patterns of radars 
provides critical early indication and warning that an adversary is searching for aircraft, 
the same can be said of cyberspace reconnaissance scanning. Detecting the 
reconnaissance scans provides the first indication to alert the network defender of a 
potential impending attack. Observing and classifying the unique fingerprints of these 
reconnaissance-scanning tools is the purpose of this thesis. 

The attacker may perform scans that utilize a number of different protocols. The 
two most common are Internet Control Message Protocol (ICMP) “pings” and 
Transmission Control Protocol (TCP) “SYN scans.” The specialized reconnaissance 
tools that the attacker may use can also perform scans using more obscure protocol 
parameters such as the ICMP timestamp request, User Datagram Protocol (UDP), and 
TCP “ACK scans,” to name a few of the possibilities. Every possible combination of 
ICMP, UDP, and TCP parameters may be used by the attacker to sneak past network 
defenses like firewalls and Intrusion Detection Systems (IDS) to minimize the possibility 
that the defender becomes alerted to the reconnaissance scans. 

There are several different kinds of scans based on where the scans originate from 
and the target destination. A one-to-one scan of multiple UDP or TCP ports is known as 
a vertical scan. A one-to-many scan of a single UDP or TCP port is known as a 
horizontal scan. Combining the multiple port aspect of the vertical scan across the 
numerous targets of the horizontal scan constitutes a scan sweep. See Figure 1 for an 
illustration of these scanning concepts. More advanced attackers may distribute the 
origins of their scan across multiple machines under their control; this type of distributed 
scan makes it challenging for the network defender to identify the true source of the 
reconnaissance scans. Once the attacker knows the status of the ports, especially the 
open ports, on the target machines, the next type of reconnaissance begins. 
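
The scan categories described above can be expressed as a simple rule over the set of (destination host, destination port) pairs probed by a single source. The following is a minimal sketch for illustration only and was not part of the experiments; the function name `classify_scan` and the labels it returns are assumptions.

```python
from typing import Iterable, Tuple


def classify_scan(probes: Iterable[Tuple[str, int]]) -> str:
    """Classify the probes sent by one source as vertical, horizontal, or sweep.

    Each probe is a (destination_ip, destination_port) pair.
    """
    probes = set(probes)
    if not probes:
        return "no probes"

    hosts = {ip for ip, _ in probes}
    ports = {port for _, port in probes}

    if len(hosts) > 1 and len(ports) > 1:
        return "sweep"        # multiple ports across multiple hosts
    if len(ports) > 1:
        return "vertical"     # one host, multiple ports
    if len(hosts) > 1:
        return "horizontal"   # multiple hosts, one port
    return "single probe"     # one host and one port


if __name__ == "__main__":
    print(classify_scan([("10.0.0.4", p) for p in (22, 25, 80)]))
```

A distributed scan would need the same grouping applied across several source addresses, which this sketch does not attempt.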



Figure 1. Illustration of Major TCP Scan Types 

c. Enumeration 

Enumeration is the process of identifying specific services running on the 
target hosts. It may be performed by the same tool and at the same time as the 
reconnaissance scan or by a specialized enumeration tool. For the most part, scanning 
involves sending packets that still properly adhere to the various protocol specifications. 
For enumeration, the attacker begins sending packets that are intentionally outside the 
protocol specifications, or mal-formed, in order to test the response from the target. The 
unique characteristics of each target system’s hardware and software implementations 
enable the attacker to specifically identify these characteristics based on the response 
from the target to the enumeration packets. Suppose a network administrator tries to be 
sneaky by hiding a service like a telnet server, which normally resides at TCP port 23, at 
a different port number, such as 49150. The attacker can still uncover this telnet server 
through enumeration. Thus, by this point in the reconnaissance process, the attacker 
knows which target machines can be reached, what ports are open, and specifics such as 
what operating system (OS) the target is running and what service is running on each 
port. With this information, the attacker can match the target’s characteristics to specific 
vulnerabilities to attack. 

2. Attack 

The second phase, the actual attack, involves gaining privileged access (root) on 
the target machine and then maintaining that access. 

a. Gaining Access 

The goal of the attack is to gain privileged (root level) access to the target 
machine so that the attacker usurps control. The attacker may use one or more types of 
attack, such as buffer overflow, user-password, application-specific, or man-in-the- 
middle (MITM) attacks. Depending on the attacker’s position to access the targeted 
network, the attacker may be able to simply intercept user authentication credentials 
passing over the network to log in masquerading as an authorized user. Novice “script 
kiddies” may use plug-and-play exploitation tools to gain access. More advanced 
hackers may deploy specifically engineered zero-day exploits to take command of their 
target while evading network defenses. 

b. Maintaining Access 

After expending significant effort to find, identify, and infiltrate a target 
machine, the attacker sets up a means to revisit the target machine and remotely execute 
code. This can be accomplished via Trojan horses, backdoors, or root kits. The attacker 
may also employ anti-forensics tools to erase logs and other evidence that the target 
machine has been compromised. 

3. Plunder 

With the target machine fully under control of the attacker, the attacker is free to 
plunder the target, to complete their subversion objective. The attacker may exfiltrate 
data from the compromised machine, use the machine as a “bot” for denial-of-service 
(DoS) attacks, use the machine as a springboard to launch attacks further into the target 
network, or for other purposes. 

B. THE CURRENT STATE OF ATTACK DETECTION 

1. Intrusion Detection Systems (IDS) 

Intrusion detection systems (IDS) are designed to detect unwanted traffic on a 
computer network. IDS may be deployed to observe, or “sniff,” network traffic across a 
specific section of the network, the entire network, or on individual host machines. The 
IDS determines whether or not packets traversing the network are legitimate by 
comparing them against the signatures of known attack vectors and anomalies, or 
departures from normally observed baseline traffic [7]. Since the primary purpose of an 
IDS is to identify malicious packets, block them, and issue alerts, the IDS may be 
constrained by resource limitations on processing the enormous amount of network 
traffic data. Consequently, many IDS do not provide extensive information regarding the 
apparently benign attributes of network reconnaissance scans [5]. 

SNORT, the most popular open-source IDS and de-facto industry standard for 
capabilities, provides a functional module to detect reconnaissance port scans called 
sfPortscan. The module generates an alert to notify administrators of a 
reconnaissance scan by detecting the “negative responses” sent back to the scanner by the 
target machines on the administrator’s network. When the number of these negative 
responses crosses a certain threshold over a specified time period, the reconnaissance 
alert is generated. The alert includes basic information such as scan protocol (ICMP, 
TCP SYN) and type (portscan or portsweep) [8]. The SNORT module does not uniquely 
identify the scanning tool that was utilized. 
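
The threshold behavior described above can be illustrated with a sliding time window over negative responses. This is a simplified sketch of the general idea only and does not reproduce sfPortscan's actual implementation; the class name, threshold, and window length are assumptions chosen for illustration.

```python
from collections import deque


class NegativeResponseCounter:
    """Flag a possible scan when too many negative responses (for example,
    TCP RSTs sent back to one scanner) fall within a time window.

    Timestamps are assumed to arrive in non-decreasing order.
    """

    def __init__(self, threshold: int = 5, window_seconds: float = 60.0):
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._timestamps = deque()

    def observe(self, timestamp: float) -> bool:
        """Record one negative response; return True if the alert fires."""
        self._timestamps.append(timestamp)
        # Discard responses that have aged out of the window.
        while timestamp - self._timestamps[0] > self.window_seconds:
            self._timestamps.popleft()
        return len(self._timestamps) >= self.threshold


if __name__ == "__main__":
    counter = NegativeResponseCounter(threshold=3, window_seconds=10.0)
    for t in (0.0, 1.0, 2.0):
        print(t, counter.observe(t))
```

Note that a scan conducted slowly enough never accumulates enough responses inside the window, which is one reason threshold-based detection alone can miss patient attackers.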

2. Passive Fingerprinting 

Just as a fingerprint can effectively distinguish an individual from everyone else, 
the multitude of parameters within all of the protocols that make the Internet work can be 
used to provide specific information about the systems communicating on the Internet. 
Passive fingerprinting techniques can be difficult because they cannot rely on “active” 
measures, like sending out packets to query the target. The principal attributes that must 
be used to identify the fingerprints reside in the TCP/IP model Layer 3 (Network) and 
Layer 4 (Transport) headers [9]. 

Initial efforts to develop a method of passively fingerprinting focused on 
identifying the operating system (OS) of the target machine. Michal Zalewski’s program, 
p0f, has remained the most successful passive OS fingerprinting program. Based on 
extensive empirical tests, p0f compares incoming packets against its database of known 
fingerprints to classify the OS of the computer that sent the packets. Zalewski’s work 
narrowed down the possible combinations of differences in the IP and TCP header fields 
to the most effective fields to examine to make the OS determination [9]. The fingerprint 
determination is made based on static data from each connection to the machine running 
p0f. The fingerprint entry format from p0f version 2.0.8 is as follows [10]: 

# Fingerprint entry format:
#
# wwww:ttt:D:ss:OOO...:QQ:OS:Details
#
# wwww    - window size (can be * or %nn or Sxx or Txx)
#           "Snn" (multiple of MSS) and "Tnn" (multiple of MTU) are allowed.
# ttt     - initial TTL
# D       - don't fragment bit (0 - not set, 1 - set)
# ss      - overall SYN packet size (* has a special meaning)
# OOO     - option value and order specification (see below)
# QQ      - quirks list (see below)
# OS      - OS genre (Linux, Solaris, Windows)
# details - OS description (2.0.27 on x86, etc)

Tools like p0f can be very effective at examining socket data to make an accurate 
OS determination. However, it has been shown that a tool like p0f can be fooled by 
altering these fingerprint parameters in an OS kernel [11]. Consequently, it is important 
to include dynamic behavioral parameters in the fingerprints. This thesis moves beyond 
OS fingerprinting to determine evidence of static and dynamic fingerprint parameters for 
network reconnaissance scanning tools. It is not the primary function of p0f, but this 
particular tool provides a rudimentary ability to identify a reconnaissance scan performed 
using Nmap. The passive fingerprinting techniques from [9] and [10] formed the starting 
point for the analysis developed in this thesis. 
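
To make the static matching idea concrete, the sketch below parses one colon-separated entry in the format shown above and compares its simplest fields against values observed in a SYN packet. It is a minimal illustration under stated assumptions, not p0f's matching logic: only literal values and "*" are handled for the window and size fields, the option and quirk fields are stored but not matched, and the sample entry in the usage lines is hypothetical rather than taken from the p0f database.

```python
from dataclasses import dataclass


@dataclass
class Signature:
    window: str    # wwww, kept as text because it may be "*"
    ttl: int       # ttt, initial TTL
    df: int        # D, don't fragment bit
    size: str      # ss, overall SYN packet size, may be "*"
    options: str   # OOO, option order specification (not matched here)
    quirks: str    # QQ, quirks list (not matched here)
    genre: str     # OS genre
    details: str   # OS description


def parse_signature(entry: str) -> Signature:
    """Split one colon-separated fingerprint entry into its eight fields."""
    w, ttl, df, size, opts, quirks, genre, details = entry.strip().split(":", 7)
    return Signature(w, int(ttl), int(df), size, opts, quirks, genre, details)


def matches(sig: Signature, window: int, ttl: int, df: int, size: int) -> bool:
    """Compare the simple static fields of an observed SYN to a signature.

    The observed TTL may be lower than the initial TTL because each router
    hop decrements it, so any positive value up to the initial TTL is
    accepted (a simplifying assumption of this sketch).
    """
    if sig.window != "*" and sig.window != str(window):
        return False
    if sig.size != "*" and sig.size != str(size):
        return False
    return sig.df == df and 0 < ttl <= sig.ttl


if __name__ == "__main__":
    # Hypothetical entry, for illustration only.
    sig = parse_signature("65535:64:1:60:M*,N,W0:.:ExampleOS:1.0 (hypothetical)")
    print(matches(sig, window=65535, ttl=61, df=1, size=60))
```

Because every field compared here is static, an attacker who alters these values defeats the match, which motivates the dynamic parameters examined in this thesis.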

3. Network Data Visualization 

Most captured network data is still displayed in a textual format, which can make 
it difficult to rapidly analyze or acquire situational awareness (SA) on the network. 
Visualization techniques show great promise to improve the network defender’s ability to 
quickly gain and maintain SA on their network [12]. It is therefore possible to develop 
visual fingerprints that could alone provide unique fingerprints of network 
reconnaissance tools [13]. The research conducted in [13] provided the second main 
approach to analysis that is further developed in this thesis. While these visualizations 
are a powerful additional tool for network defenders, they are vulnerable to occlusion 
attacks that force the network defender into the same tedious analysis process as the 
current text-based systems [14]. This thesis pairs some visualization techniques with 
an analysis of the specific textual hard network data to present a more descriptive 
fingerprint. 

4. The “Advanced Persistent Threat” 

Expert attackers, such as state-sponsored information warriors, have progressed to 
the point that sometimes the first indication that a network has been compromised is 
detecting the exfiltration or corruption of data. An organization may be suddenly 
undercut by a competing firm utilizing technology that looks suspiciously like the 
defender’s own proprietary technology [15]. Expert attackers can infiltrate networks and 
establish a pervasive presence capable of rapidly reacting to defensive tactics to continue 
exfiltrating data, a capability aptly termed by one industry-leading company the “Advanced 
Persistent Threat” [16]. Consequently, it is of vital importance to develop the best early 
indication and warning that a network is being targeted by detecting and identifying the 
adversary while the attacker is still in the reconnaissance phase of its attack. 

III. EXPERIMENTATION METHODOLOGY 


A. ISOLATED LABORATORY TEST NETWORK SETUP 

An isolated laboratory test network was set up to conduct experiments and collect 
data on the characteristics and behavior of several common network reconnaissance tools. 
The experimental network consists of dedicated machines for the attacker, the 
target, and data capture. These computers are linked via standard Category 5e 
Ethernet cable through a commercial grade 3Com 3300 network switch. Both the 
attacker and target were loaded to dual-boot Microsoft Windows XP Service Pack 3 (XP 
SP3) and Ubuntu Linux 9.04 (32-bit version, Linux kernel 2.6.28). A dedicated Dynamic 
Host Configuration Protocol (DHCP) server was installed to assign IP addresses and to 
simulate a typical enterprise network configuration. This DHCP server also provided a 
backup network data-capture capability and allowed for expansion of the test network to 
include additional target hosts. 

A diagram of the isolated laboratory test network can be seen in Figure 2. 

B. DATA COLLECTION 

1. Selection and Configuration of Software Tools 

The different tools to be analyzed were selected from a number of different 
sources, including network security texts, such as [4]-[6], [13], [14], [17]-[19], and a 
Web site [20]. 

Once the tools were installed and initial tests of each were attempted, the group of test 
programs was narrowed down to focus on the default patterns of TCP reconnaissance 
scanning tools. It was discovered that by default, Windows XP SP3 does not reply to 
ICMP messages. Many of the reconnaissance tools, also by default, verify which target 
hosts are reachable by starting scans with an ICMP ping to the target, and proceed with 
the TCP port scan only against the target machines that replied to the ICMP 
ping. Without a response from the target Windows XP SP3 machine, the follow-on TCP 
port scans would not occur. To keep the evaluation of the reconnaissance tools limited to 
default behavior as much as possible, the target machine’s Windows XP firewall was set 
to allow ICMP responses. Some reconnaissance tools can be configured to conduct scans 
without a preparatory ICMP ping. However, the goal was to observe each tool’s default 
behavior to ascertain its unique fingerprint. Furthermore, many networks have been 
configured to drop all ICMP messages from outside the network as a defensive tactic to 
prevent an ICMP DoS Attack caused by a flood (i.e., large volume) of ICMP packets [4], 
[19]. 

For Windows XP Service Pack 2 and beyond, Windows no longer permits default 
raw socket support for programs, even when running as an administrator. Thus, many of 
the venerable older Windows scanning tools do not function properly with just the default 
installation. These Windows scanning tools, like Superscan by Foundstone, date back as 
far as 2002, with little or no continued development on the tools since that time [21]. 
Consequently, the scanning tools evaluated in this thesis are all Linux based with the 
exception of Zenmap. 

An additional setting modification was made to the Windows XP firewall to 
permit connections to TCP ports 25 and 80. These two ports correspond to Simple Mail 
Transfer Protocol (SMTP, better known as email) and HyperText Transfer Protocol 
(HTTP, better known as the application backbone of the World Wide Web) [22]. Thus, 
the reconnaissance tools could scan the target in their default mode and be assured of 
discovering some information about the target’s TCP ports. 

2. Data Capture 

The target machine maintained a Windows XP SP3 environment to 
simultaneously capture the reconnaissance scan packets with Wireshark [23] and display 
real-time network data visualization with RUMINT [24]. New packet capture “.pcap” 
data files were created in Wireshark for each test run of each reconnaissance tool. At 
least three test runs were conducted for each tool in the default configuration. An 
additional three test runs were conducted for each tool to specifically target TCP Ports 
80, 139, and 65000. These ports were selected to create a narrow data set showing a 
listening but closed port, an open port, and a non-listening closed port. Lastly, a test was 
conducted to observe each tool perform a scan against the entire TCP Port range (i.e., 
from 0 to 65,535). However, the principal objective was to observe the default TCP scan. 
The limited and complete port range tests served to verify that the reconnaissance tool 
acted in a similar manner regardless of the number of ports being scanned. 

C. DATA ANALYSIS 

The purpose of analyzing the data captured from these reconnaissance scans is not 
to reverse engineer the tools. Rather, the captured data is analyzed to observe the specific 
static and dynamic characteristics of the tools to create unique fingerprints to identify 
each reconnaissance tool. For the purpose of developing these unique fingerprints, it is 
not necessary to fully explain all of the behaviors of the different reconnaissance tools. 
As will be shown, the deeper the analysis delves, the more questions arise that complicate 
the process. It is adequate to identify the static and dynamic characteristics to develop 
the fingerprint; any further information and analysis may be beneficial, but is beyond the 
scope of this research. 

The next section describes the particular TCP/IP packet header fields that are of 
interest to developing the fingerprints. 

1. Packet Header Fields 


The following fields from the IP and TCP headers form the basic parameters to 
evaluate to identify a particular tool’s unique fingerprint. Additional information 
regarding the IP and TCP header fields can be found in [25], [26], [18], and [9]. 

a. IP Total Length 

Total Length is a 16-bit IP field stating the entire length of the datagram, 
measured in octets. This is a simple value, but it does help specify which reconnaissance 
tool or OS sent the packet. 

b. IP Identification 

The IP Identification field is a randomized 16-bit value assigned by the 
sender to aid in assembling fragments of a datagram. It should uniquely identify the 
particular datagram from all others originated by the sender. 

c. IP Flags 

The IP Flags consist of 3 bits. The first bit is reserved and should always 
be zero. The second bit is the Don’t Fragment (DF) bit, specifying whether or not a 
datagram should be fragmented or just dropped if it reaches a network segment that 
cannot handle the length of the datagram. Most OSs set DF to 1, meaning Don’t 
Fragment, in order to implement a form of Path Maximum Transmission Unit Discovery 
(PMTUD). This scheme has the sender adjust the maximum size of the packets it sends 
to minimize the possibility of the packets getting fragmented. The last bit is the More 
Fragments flag, signifying there are additional fragments to be reassembled following the 
current packet fragment. The first and last IP flag bits do not play a large role in 
scanning, but may be used extensively in enumeration. 

d. IP Time to Live (TTL) 

The TTL field is 8 bits, allowing a range from 0-255. It prevents packets 
from endlessly circulating the Internet. Each time a packet gets passed on by a router, the 
TTL value is decremented. When the TTL value reaches zero, the packet is dropped. 
Different OS manufacturers set this value differently, providing a good indicator of the 
source machine’s OS. For example, UNIX variants have an initial TTL=255, Windows 
machines have an initial TTL=128, and Linux variants have an initial TTL=64. Due to 
the efficiency of the Internet’s routing, it is rarely necessary for a packet to require more 
than 15 hops to reach its destination. This allows an estimation of the sender’s distance 
in router hops from the recipient. 
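
The hop estimate described above amounts to rounding the observed TTL up to the nearest common initial value and subtracting. The sketch below is illustrative only; it assumes the three initial values named in the text (64, 128, and 255) are the only candidates, and the function names are hypothetical.

```python
# Initial TTL values named in the text, in ascending order.
COMMON_INITIAL_TTLS = (64, 128, 255)


def estimate_initial_ttl(observed_ttl: int) -> int:
    """Return the smallest common initial TTL that is not below the observed TTL."""
    if not 0 <= observed_ttl <= 255:
        raise ValueError("TTL must fit in 8 bits (0-255)")
    for initial in COMMON_INITIAL_TTLS:
        if observed_ttl <= initial:
            return initial
    raise AssertionError("unreachable: 255 is the largest 8-bit value")


def estimate_hops(observed_ttl: int) -> int:
    """Estimate how many routers decremented the TTL on the way."""
    return estimate_initial_ttl(observed_ttl) - observed_ttl


if __name__ == "__main__":
    # A packet arriving with TTL 113 most plausibly started at 128.
    print(estimate_initial_ttl(113), estimate_hops(113))
```

The estimate is ambiguous whenever a sender uses an uncommon initial TTL or deliberately alters it, so it should be treated as one clue among several.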

e. IP Source Address 

The IP Source Address is the 32-bit IP address of the source of the 
datagram, normally displayed in a dotted decimal format. For the experiment’s 
reconnaissance scans, the Source IP is 10.0.0.3. 

f. IP Destination Address 

The IP Destination Address is the 32-bit IP address of the destination of 
the datagram, also displayed in a dotted decimal format. For the experiment’s 
reconnaissance scans, the Destination IP is 10.0.0.4. 

g. TCP Source Port 

The TCP Source Port is a 16-bit value offering a range between 0 and 
65,535. The TCP Port numbers are divided into three ranges, ports 1-1023 are “well- 
known,” ports 1024-49151 are “registered,” and ports 49152-65535 are “dynamic.” The 
well-known ports offer a common frame of reference for popular services like email at 
port 25 and http at port 80. Since each vendor’s software implements the TCP/IP stack 
slightly differently, the selection of the TCP Source Port number can also provide some 
clues for identifying the OS of the sender. 
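
When tabulating the source ports a tool selects, it can help to label each observed port with its range. The following minimal sketch uses the boundaries given above; the function name is hypothetical, and grouping port 0 with the well-known range is an assumption of this sketch.

```python
def port_range(port: int) -> str:
    """Label a TCP port number with the range it falls in."""
    if not 0 <= port <= 65535:
        raise ValueError("port must fit in 16 bits (0-65535)")
    if port <= 1023:
        return "well-known"   # port 0 is grouped here by assumption
    if port <= 49151:
        return "registered"
    return "dynamic"


if __name__ == "__main__":
    for p in (80, 1024, 49152):
        print(p, port_range(p))
```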

h. TCP Destination Port 

The TCP Destination Port is also a 16-bit value, sharing similar properties 
as the Source Port. Since the most common services are running on a well-known port, 
most of the reconnaissance tools send the majority of their scanning packets to the well- 
known ports. 

i. TCP Sequence Number 

The TCP Sequence Number is 32 bits long with possible values between 
0-4,294,967,295. It plays a role similar to the IP ID field to uniquely identify a 
connection, but more importantly it also forms a method of synchronizing and accounting 
for the data transferred between two machines in a TCP conversation along with the TCP 
Acknowledgement Number [26]. The original TCP specification contained a weakness 
in TCP Sequence Number implementation that allowed an attacker to hijack TCP 
connections. Corrections were made to OS kernels to introduce pseudo-randomness into 
the TCP Sequence Number generation, with varying results [27], [28]. Variations in TCP 
Sequence Number behavior were revealed to be one of the best ways to determine the 
dynamic unique fingerprint of the scanning tools. It could also be very useful to uncover 
reconnaissance scans that are conducted slowly over a longer period of time in an attempt 
to evade IDSs. Unfortunately, TCP Sequence Number analysis also proved to be one of 
the most time-consuming parameters to analyze. 
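
One simple way to examine sequence number behavior is to compute the difference between consecutive values in a capture, taken modulo 2^32 so that wraparound of the 32-bit field is handled. The sketch below is a hedged illustration of that idea and not necessarily the analysis procedure used in this thesis; the function names and the three behavior labels are assumptions.

```python
from typing import List, Sequence

SEQ_MODULUS = 2 ** 32  # the sequence number is a 32-bit field


def sequence_deltas(seq_numbers: Sequence[int]) -> List[int]:
    """Differences between consecutive sequence numbers, modulo 2**32."""
    return [(b - a) % SEQ_MODULUS for a, b in zip(seq_numbers, seq_numbers[1:])]


def describe_sequence_behavior(seq_numbers: Sequence[int]) -> str:
    """Give a coarse label for how the sequence numbers change over a scan."""
    deltas = sequence_deltas(seq_numbers)
    if not deltas:
        return "insufficient data"
    distinct = set(deltas)
    if distinct == {0}:
        return "constant"          # the same value is reused in every packet
    if len(distinct) == 1:
        return "fixed increment"   # every packet advances by the same step
    return "varying"               # no single step explains the series


if __name__ == "__main__":
    # The first step wraps past the top of the 32-bit range.
    print(sequence_deltas([4294967290, 4, 14]))
```

A label of "varying" says nothing about how random the values are; distinguishing pseudo-random generation from a more complex deterministic pattern requires further analysis of the deltas.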

j. TCP Acknowledgement Number 

The TCP Acknowledgement Number is also 32 bits long. If the “ACK” 
Control Flag is set, this field contains the value of the next sequence number the sender 
of the segment is expecting to receive. It completes the method of accounting for the 
amount of data being exchanged between the two computers at each step in the 
communication process as described above. By default, all of the reconnaissance tools 
tested performed “SYN” scans. Consequently, the TCP Acknowledgement Number field 
was always zero and inconsequential to fingerprint analysis. However, many of the tools 
have the functionality to send “ACK” scans, at which point monitoring the TCP 
Acknowledgement Number becomes pertinent. 

k. TCP Control Flags 

The TCP Control Flags are a 6-bit field, from left to right: URG (urgent 
pointer field significant), ACK (acknowledgement field significant), PSH (push 
function), RST (reset the connection), SYN (synchronize sequence numbers), and FIN 
(no more data from sender). As mentioned above, by default, all the reconnaissance 
scanners tested performed “SYN” scans. The incoming SYN packets are treated like 
legitimate attempts to initiate a connection by the receiving machine. However, the 
scanner has no intention of completing the synchronization of the “Three-way TCP 
handshake” to establish normal communications between the scanner and target. 
Consequently, if the target TCP port is open, the target still responds with the proper 
“SYN+ACK” packet, but the scanning program has moved on to the next target port. 
The scanner remembers that the port that replied with “SYN+ACK” is open to provide 
this important information to the attacker when the scan is complete. However, since the 
scanning program is often operating independently of its native OS kernel’s TCP/IP 
stack, when the scanner receives the “SYN+ACK” packet, the kernel responds with a 
“RST” packet. The kernel had no knowledge of the “SYN” packet being sent, so from 
the kernel’s perspective, the “SYN+ACK” it received does not belong to any connection 
and is therefore answered with a “RST.” 

If the target port is closed, with no service listening on it, the target 
machine responds to the “SYN” packet with a “RST” packet. This still provides valuable 
information to the scanner. Consequently, many systems and firewalls have been 
configured to simply send no response when a “SYN” packet is received at a closed port 
[22]. When the attacker conducts a vertical scan as described in the previous chapter, 
there are considerably more closed ports than open ports. These negative responses are 
the primary indicators to an IDS of reconnaissance scanning activity [7]. 

Various combinations of TCP Control Flags are frequently used in the 
enumeration stage of reconnaissance. Certain combinations may also enable the attacker 
to pass through a firewall with a weak rule set and still conduct its reconnaissance scan. 
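
The flag layout and the open/closed port logic described above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the masks assume the 6-bit layout listed in this section, with FIN as the least significant bit.

```python
# Bit masks for the six TCP control flags, listed from the most
# significant bit (URG) to the least significant bit (FIN).
TCP_FLAGS = [
    ("URG", 0x20),
    ("ACK", 0x10),
    ("PSH", 0x08),
    ("RST", 0x04),
    ("SYN", 0x02),
    ("FIN", 0x01),
]


def decode_tcp_flags(value):
    """Return the names of the control flags set in a flags value."""
    return [name for name, mask in TCP_FLAGS if value & mask]


def classify_probe_reply(value):
    """Interpret a target's reply to a SYN probe (hypothetical helper)."""
    flags = set(decode_tcp_flags(value))
    if {"SYN", "ACK"} <= flags:
        return "open"      # target answered with SYN+ACK
    if "RST" in flags:
        return "closed"    # target reset the connection attempt
    return "unknown"       # e.g. a filtered port that sends no reply


print(decode_tcp_flags(0x02))      # flags value of a plain SYN probe
print(classify_probe_reply(0x12))  # SYN+ACK reply
```

A flags value of 0x02, as recorded for the scan packets captured in this research, decodes to SYN alone.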

l. TCP Window 

The TCP Window is a 16-bit field that represents the number of data 
octets, beginning with the one indicated in the acknowledgement field, which the sender 
of the segment is able to accept. For fingerprinting, it is significant whether the window is a constant 
value, one of a fixed set of values, or a multiple of the Maximum Segment Size (MSS) from the 
TCP Options field. 
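
The window-to-MSS relationship can be checked with simple integer arithmetic. The sketch below is illustrative only; the function name is hypothetical, and the sample values are taken from the fingerprint summaries reported later in this thesis.

```python
def window_mss_relation(window, mss):
    """Describe how a TCP window size relates to the advertised MSS."""
    if mss <= 0:
        raise ValueError("MSS must be positive")
    quotient, remainder = divmod(window, mss)
    if remainder == 0:
        return f"{quotient} x MSS"
    return "not a multiple of MSS"


# Window and MSS pairs reported in the fingerprint summary tables.
print(window_mss_relation(5840, 1460))   # Strobe and non-root Nmap
print(window_mss_relation(2048, 1460))   # one of Nmap's four fixed windows
print(window_mss_relation(16384, 1436))  # Unicornscan
```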
m. TCP Options 

TCP Options are placed at the end of the TCP header and are multiples of 
8 bits in length. The order and selection of information included in the TCP Options also 
helps specify the sender’s OS. Windows machines exhibit the following order: MSS, no 
operation performed (NOP, a value of 0x01 in the packet), NOP, and Selective ACK 
Permitted. Linux machines output the following order: MSS, Selective ACK Permitted, 
Timestamp, NOP, and Window scale [9]. Similar to these OS TCP Options differences, 
the reconnaissance tools also differ in use of the TCP Options field. 
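
As a rough illustration of using option order as a fingerprint, the two orderings quoted above can be stored as tuples and compared against an observed packet. The dictionary and function names are hypothetical, and only the two orderings named in this section are included.

```python
# Option orderings quoted in the text for each operating system.
KNOWN_OPTION_ORDERS = {
    "Windows": ("MSS", "NOP", "NOP", "SACK Permitted"),
    "Linux": ("MSS", "SACK Permitted", "Timestamp", "NOP", "Window Scale"),
}


def match_option_order(options):
    """Return the OS whose option order matches exactly, or None."""
    observed = tuple(options)
    for os_name, order in KNOWN_OPTION_ORDERS.items():
        if observed == order:
            return os_name
    return None


# A packet carrying only an MSS option matches neither OS ordering,
# which hints that the packet was crafted by a tool rather than a kernel.
print(match_option_order(["MSS"]))
```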

2. Wireshark Filter 

Packets on the isolated laboratory test network were captured as “.pcap” data files 
with the program Wireshark running on the target machine. Separate files were created 
for each reconnaissance scan and saved so that they could be exported to other machines 
for follow-on analysis. Wireshark was used to filter the displayed data to only include 
the relevant packets sent from the attacker machine, 10.0.0.3 [23]. To facilitate data 
export for further analysis in a spreadsheet, custom column headings in the packet list 
view pane were created. This enabled the detailed packet header fields to be exported 
into comma-delimited value fields in “.csv” files. Figure 3 shows an example screenshot 
from Wireshark. The author found that using a widescreen monitor to display packet 
information made a more effective use of the visual area than the default Wireshark 
configuration. 
Figure 3. Wireshark network data analysis and parsing screen capture 


3. Spreadsheet Analysis 

The comma-delimited “.csv” files exported from Wireshark were 
imported into Microsoft Excel and converted into Excel spreadsheet files. This provided 
the experimenter with the ability to extract additional data, for example, charts of TCP 
Sequence Number trends. Figure 4 shows a typical screenshot. Note how, by themselves, the 
rows and columns of numerical data are difficult to interpret in terms of the network activity 
they represent. 
Figure 4. Excel spreadsheet network data analysis 
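
The export-and-analyze step can also be approximated without a spreadsheet. The sketch below reads a “.csv” export with Python's standard csv module and keeps the sequence numbers of packets sent by the attacker. The column headings and the sample rows are hypothetical placeholders, since the exact custom column names used in this research are not listed in the text.

```python
import csv
import io

# Illustrative rows only; the column headings are assumed, not the
# actual custom Wireshark columns used in this research.
SAMPLE_CSV = """No.,Source,Destination,SrcPort,DstPort,Seq,Window
36,10.0.0.3,10.0.0.4,35361,554,1885629303,2048
37,10.0.0.3,10.0.0.4,35361,8888,1885629303,4096
38,10.0.0.4,10.0.0.3,443,35361,0,5840
"""


def load_sequence_numbers(csv_text, attacker="10.0.0.3"):
    """Return (packet number, sequence number) pairs sent by the attacker."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        (int(row["No."]), int(row["Seq"]))
        for row in reader
        if row["Source"] == attacker
    ]


for number, seq in load_sequence_numbers(SAMPLE_CSV):
    print(number, seq)
```

The resulting pairs can be passed to any plotting tool to reproduce sequence number trend charts.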

4. RUMINT Parallel Coordinate Plots 

RUMINT is a network data visualization tool created by Greg Conti [14], [24]. 
Among its most useful features are its VCR-like playback interface and parallel 
coordinate plot. The program was run both live, as the reconnaissance scans were 
performed for data capture, and again later to assist in data analysis. The parallel 
coordinate plot is constructed in the following manner: for each field listed on the 
horizontal axis along the bottom, there is a corresponding vertical line displaying the 
linear minimum to maximum potential value for that field. The program allows 
customized selection of horizontal fields to display and color-codes the packet protocols, 
with TCP being green, UDP as orange, and ICMP as blue [14], [24]. An example plot is 
shown in Figure 5. Note how much more rapidly one can develop SA about what activity 
is occurring on the network compared to looking at the screen captures from Wireshark 
and Excel. 
Figure 5. RUMINT Parallel Coordinate Plot Example 
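
The construction of a parallel coordinate plot can be sketched as follows. This is not RUMINT's implementation; it is a minimal illustration that assumes each vertical axis runs linearly from zero to the largest value the field can hold. The field names are hypothetical.

```python
# Maximum value of each header field, derived from its width in bits.
FIELD_MAX = {
    "ttl": 2**8 - 1,
    "src_port": 2**16 - 1,
    "dst_port": 2**16 - 1,
    "tcp_seq": 2**32 - 1,
}


def to_parallel_coordinates(packet, fields):
    """Map each field to a height in [0, 1] on its own vertical axis."""
    return [packet[name] / FIELD_MAX[name] for name in fields]


packet = {"ttl": 45, "src_port": 35361, "dst_port": 554, "tcp_seq": 1885629303}
heights = to_parallel_coordinates(packet, ["ttl", "src_port", "dst_port", "tcp_seq"])
print(heights)
```

Each packet becomes one polyline connecting its heights across the axes, so fields that a tool holds constant collapse to a single point on their axis.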
IV. EXPERIMENTATION RESULTS 


This chapter reports the results of the methodology discussed in the previous 
chapter. Each section addresses one of the reconnaissance tools examined. 

A. NMAP 

Nmap, created by Gordon “Fyodor” Lyon, is the most famous network 
reconnaissance tool [22], [29]. Nmap has undergone continuous improvement since its 
initial release in 1997, setting the standard in functional reconnaissance capabilities. 
Nmap offers a considerable number of advanced scanning features, including active OS 
version detection and enumeration. However, this thesis is focused on the default TCP 
vertical SYN scan. 

The test scans were initiated with root-level privileges and default settings by the 
terminal command sudo nmap 10.0.0.4. If necessary, Nmap begins by converting 
the specified target into an IPv4 address with a Domain Name System (DNS) lookup. 
Next, it determines if the target host is reachable by sending an ARP request, ICMP echo 
request ping, or TCP ACK packet to port 80 in iterative attempts. It then performs a 
reverse-DNS query to convert the IPv4 address back to a name. This reverse-DNS query 
can be turned off with the “-n” switch, but it is included in default behavior. Detecting 
this reverse-DNS query can be an alert to the presence of an unauthorized network scan, i.e., 
the attacker has already gained direct access to the network. However, if the scan were 
conducted from outside the defender’s network, the defender would not be able to detect 
these reverse-DNS lookups. Nmap then begins its principal effort: a TCP SYN scan 
against Fyodor’s empirically determined 1,000 most used ports. A savvy attacker may 
modify these 1,000 default scanned ports by changing the list of ports to scan in the 
“nmap-services” program file. Consequently, the simple list of the destination ports 
is not as reliable a fingerprint parameter as other behavioral characteristics. The tests 
revealed that a complete default Nmap scan consisted of 1,997 packets. A direct 
comparison of these destination ports is left to further research. Lastly, Nmap prints its 
scan results for the attacker [22], [29]. Since this thesis is focused on developing the 
network defender’s ability to passively fingerprint the reconnaissance tools, the accuracy 
and usefulness of the reconnaissance tool’s output are inconsequential. 


Figure 6 shows the complete RUMINT Parallel Coordinate Plot view of a default 
Nmap scan. Note how many of the possible values for the IP Identification numbers are 
used compared to the apparently nearly singular value for the TCP Sequence Number. 
More analysis of this observation will be made later in this section. The IP Identification 
Number continued to show similar behavior for every reconnaissance tool tested, with 
one notable exception. Consequently, the IP ID field and several others that were of little 
value to developing the fingerprints were filtered out in subsequent RUMINT figures and 
analysis. 



Figure 6. Nmap default scan in RUMINT Parallel Coordinate Plot view 

Figure 7 represents the field headers that provide the best concise illustration of 
each reconnaissance tool’s visual fingerprint. This RUMINT header field format will be 
used for the remainder of the analyses. 

For Nmap’s default scan, note the following: constant packet length, Don’t 
Fragment Flag not set, a spread TTL distribution, single or very small spread Source Port 
selection, the high concentration of packets sent to well-known and registered destination 
ports, and the narrow TCP Sequence Number range. 

RUMINT Parallel Coordinate Plot: Nmap (root) Selected Packet Header Fields 

Figure 7. Nmap default scan in RUMINT 

Further analysis in Wireshark showed that the packets with a TTL value of 64 
were the “RST” packets sent in response to the “SYN+ACK” from the target’s open 
ports. The actual SYN scan activity used an apparently randomized TTL value within a 
set range. The varying TTL value would make it more difficult for the target to 
determine the attacker’s distance in terms of router hops. 

The next figure shows a comparison of the TCP sequence numbers and the IP 
identification numbers. Both are ordinarily supposed to be initialized as a random 
number to make it difficult for an attacker to perform an injection MITM attack by 
guessing the next IP ID or TCP sequence number. Recall that the TCP sequence number 
of a SYN packet serves as the Initial Sequence Number to synchronize follow-on 
communications between the sender and receiver. Figure 8 
demonstrates that Nmap does not use a large range of initial TCP sequence numbers; 
the series appears to be very linear. Each independent vertical axis on the figure shows the 
minimum and maximum values for the TCP sequence number and IP identification 
fields. This allows a comparison of each field’s use of possible values relative to the 
other. 



Figure 8. Nmap TCP Sequence Number and IP ID Number Whole Scan Series Comparison 


Figure 8 shows the results of the comparison for all 1,997 packets in the default 
scan. The wide-ranging spread of the IP identification numbers occludes the apparently 
linear TCP sequence numbers. Figure 9 shows the results for only the first 100 packets. 
The scale of the TCP sequence number vertical axis was also changed to show the trend 
in greater detail. 
IP ID Number & TCP Sequence Number Trend Comparison (horizontal axis: Packet Number, increasing; series: TCP Sequence Numbers, IP ID Numbers) 

Figure 9. Nmap TCP Sequence Number and IP ID Number First 100 Packets Series Comparison 


At this scale, it becomes clear that the TCP sequence number is not wholly linear, but 
follows a distinct trend alternating between two values. This characteristic is not readily 
obvious by watching the playback in RUMINT or in the initial observations of the textual 
display of TCP sequence numbers in Wireshark or Excel data fields. 

As mentioned earlier, the IP identification field maintained a similar randomized 
trend for each test run of every reconnaissance tool tested, with the exception of one. 
Consequently, the IP identification field was dropped from further analysis. The next 
component of this line of analysis was to determine if the Nmap TCP sequence number 
alternating sequence displayed any patterns by itself. Figure 10 shows the TCP sequence 
number trend of 200 packets, starting with packet 1,400. 
Figure 10. Nmap TCP Sequence Number Trend Near End of Scan 


Figure 10 demonstrates that Nmap’s TCP sequence numbers do not maintain a 
constant frequency-alternating pattern. In the test data shown here, the lower TCP 
sequence number was 1885629303 and the higher alternating number was 1885694838. 
The difference between the upper and lower numbers was 65,535. 

Two other interesting occurrences were noted regarding Nmap’s TCP sequence 
number trend. First, Nmap only sends one packet to each port during the course of its 
default scan, with the exception of port 80, which receives three packets. For both the 
second and third packet sent to port 80, the TCP sequence number made a significant 
jump in value. It then resumed the previous alternating trend. The first jump, the second 
packet to port 80, was an even 256 times the base alternating difference value of 65,535 
above the lower sequence number. The second jump, the third packet to port 80, was an 
even 512 times the base alternating difference value of 65,535 above the lower sequence 
number. 
Second, we observed that the TCP source port also alternates, by a difference of 
one, between two values. In this test, the source ports were 35361 and 35362. It was 
further observed that the source port alternates in the exact same low-high pattern as the 
TCP sequence numbers. This is an additional distinguishing fingerprint of Nmap. These 
distinct saw-tooth patterns were each sufficient to uniquely identify Nmap. Therefore, a 
deeper analysis of these patterns is left to future research. 
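
The arithmetic behind the alternating sequence numbers, the port 80 jumps, and the matching source port pattern can be checked with a short sketch. The two sequence numbers and the two source ports are the values reported above; the function names are hypothetical, and the jump values are computed from the stated multiples rather than copied from the capture.

```python
LOW_SEQ = 1885629303   # lower alternating value reported in the text
HIGH_SEQ = 1885694838  # higher alternating value reported in the text
BASE = HIGH_SEQ - LOW_SEQ


def two_levels(values):
    """Return the sorted pair of values if a series uses exactly two."""
    levels = sorted(set(values))
    return levels if len(levels) == 2 else None


def port80_jump(lower, base, multiple):
    """Sequence number 'multiple' base differences above the lower value."""
    return (lower + multiple * base) % 2**32  # sequence numbers are 32-bit


def levels_move_together(seqs, ports):
    """True if the sequence number is high exactly when the port is high."""
    seq_levels = two_levels(seqs)
    port_levels = two_levels(ports)
    if seq_levels is None or port_levels is None:
        return False
    return all(
        (s == seq_levels[1]) == (p == port_levels[1])
        for s, p in zip(seqs, ports)
    )


print(BASE)                              # base alternating difference
print(port80_jump(LOW_SEQ, BASE, 256))   # second packet to port 80
print(port80_jump(LOW_SEQ, BASE, 512))   # third packet to port 80
```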


As observed in Figure 7, Nmap uses a randomized range of different TTL values. 
These values fall between 37 and 59. If only a few packets were sent, this could make it 
difficult to estimate the sender’s distance away by counting router hops decrementing the 
TTL values. However, with the 1,997 packets sent in the default scan, the spread of TTL 
values between 37 and 59 remains apparent. An illustration of the standardized TTL 
value frequency distribution is shown in Figure 11. The frequency distribution at each 
TTL value did not remain constant, but always followed a pattern similar to the one 
shown. 


Figure 11. Nmap Standardized Packet TTL Frequency Distribution 
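
A frequency distribution of this kind can be computed directly from the captured TTL values. The sketch below assumes that “standardized” means each count is divided by the total number of packets; that interpretation, the function names, and the sample values are assumptions for illustration.

```python
from collections import Counter


def ttl_distribution(ttls):
    """Return the relative frequency of each TTL value, in ascending order."""
    counts = Counter(ttls)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {ttl: counts[ttl] / total for ttl in sorted(counts)}


def within_nmap_ttl_range(ttls, low=37, high=59):
    """True if every TTL lies in the range reported for Nmap scan packets."""
    return all(low <= ttl <= high for ttl in ttls)


sample = [37, 37, 40, 59]  # made-up values, not taken from the capture
print(ttl_distribution(sample))
print(within_nmap_ttl_range(sample))
```

The kernel-generated “RST” packets with a TTL of 64 would need to be filtered out before applying the range check.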
An additional unique characteristic of Nmap is the randomized selection of a TCP 
Window size from only four fixed values: 1024, 2048, 3072, and 4096. Since 
these values are fixed, they provide an effective method to rapidly check a dataset for the 
possible presence of Nmap by incorporating the values into a Wireshark display filter. 
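
One way to build such a display filter is sketched below. The helper function is hypothetical, and the field names (ip.src, tcp.flags.syn, tcp.flags.ack, tcp.window_size_value) should be confirmed against the Wireshark version in use, since field names have changed between releases.

```python
NMAP_WINDOWS = (1024, 2048, 3072, 4096)


def nmap_window_filter(src_ip, windows=NMAP_WINDOWS):
    """Build a Wireshark display filter for SYN packets with Nmap's windows."""
    window_clause = " || ".join(
        f"tcp.window_size_value == {w}" for w in windows
    )
    return (
        f"ip.src == {src_ip} && tcp.flags.syn == 1 && tcp.flags.ack == 0"
        f" && ({window_clause})"
    )


print(nmap_window_filter("10.0.0.3"))
```

A match on this filter is only a hint, because a legitimate host could also advertise one of these window sizes.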

The preceding discussion has outlined a number of analysis techniques used to 
determine a unique fingerprint for Nmap. Table 1 summarizes the results. 


Nmap v4.90RC1 Fingerprint Summary 

Internet Protocol version 4 
  Packet Length            60 bytes 
  Identification           Randomized over 99.9% of possible values 
  Flags                    Not Set (0x00) 
  TTL                      Randomized between 37-59 
  Source Address           10.0.0.3 (attacker) 
  Destination Address      10.0.0.4 (target) 

Transmission Control Protocol 
  Source Port              35361-35362 Alternating Sequence 
  Destination Port         min=1 to max=65389, from "nmap-services" program file 
  Sequence Number          Alternating Sequence, difference=65535, less than .01% of possible values 
  Acknowledgement Number   0, correct for SYN packet 
  Control Flags            SYN 
  Window                   Random selection of 4 fixed values: 1024, 2048, 3072, 4096 
  TCP Options              MSS=1460 bytes, 2 bytes of padding 

Comments 
  # Source Port changes at same time in sequence with TCP Sequence Number shifts 
  # Scans Port 80 (http) 3 times, all others only once 
  # On 2nd and 3rd packet to Port 80, TCP Sequence Number step jumps to a 
    higher value (256x and 512x the base difference of 65535), then returns to 
    normal sequence 
  # Window size is not a multiple of MSS 
  # TCP Options only MSS, not as expected from a Linux kernel 

Table 1. Nmap (root) Fingerprint Summary 


When reviewing the different data captures for analysis, it was noted that one of 
the early default Nmap captures had accidentally been performed without root-level 
privileges. There were several notable differences compared to Nmap run as root, each 
suggesting that the program was relying more on the system kernel to 
perform certain functions. First, the “Don’t Fragment” flag was set, as expected for a 
modern OS utilizing PMTUD. Next, the TTL was 64 for all the packets sent, as expected 
from a Linux kernel. The Window size was 5840, exactly four times the 
MSS, which was 1460. The TCP options were also set as expected for Linux, in the 
following order: MSS, SACK permitted, timestamp, NOP, and window scale. Lastly, the 
TCP sequence numbers were examined and found to oscillate within a band of 
approximately 16,500,000 while increasing overall. The graph in Figure 12 
shows this trend; compare to Figure 10 to see the difference between TCP sequence 
numbers with user privileges and root privileges. 



Figure 12. Nmap (user) TCP Sequence Number Trend 


The methodologies introduced above were used to examine the remainder of the 
reconnaissance tools. The following sections detail the notable results. 


B. ZENMAP 

Zenmap is a Windows Graphical User Interface (GUI) for Nmap [22], [29]. It 
was included for examination to determine potential differences between the Nmap 
engine implementation in its native Linux versus Windows. The RUMINT visualization 
of the default Zenmap scan is shown in Figure 13. Note the following: constant packet 
length, a range of TTL values (looks similar to Nmap), single or very small spread (and 
high numbered) source port, high concentration of packets sent to well-known and 
registered destination ports, and the narrow TCP sequence number range. These 
characteristics all look nearly indistinguishable from Nmap. Figure 13 may make it 
appear that Zenmap frequently chooses a higher source port than Nmap; however, within 
the tests conducted for this thesis, that distinction was an unreliable indicator. Future 
research with a much larger test sample base would be needed to make an accurate 
determination based on the selection of source ports. 


RUMINT Parallel Coordinate Plot: Zenmap Selected Packet Header Fields (axes: Packet Length, IP Flag, TTL, Src IP, TCP Src Port, TCP Dst Port, Dst IP, TCP Seq Number) 

Figure 13. Zenmap default scan in RUMINT 

To continue the analysis to find a difference between Nmap and Zenmap, two 
graphs comparing TCP sequence number trends were made. The first comparison, 
Figure 14, spans the entire scan, all 1,997 packets in both cases. The spikes that 
correspond to the second and third packets sent to TCP port 80 occur at approximately the 
same point in both the Nmap and Zenmap scans. Note that the vertical axes are both measured in 
the millions, but the scales do not allow a direct comparison. There is an offset because 
Nmap and Zenmap initialized the scan sequence on different TCP sequence numbers. 
That was to be expected, and the separate axes were used to enable a better relative 
comparison on the same graph. 

Figure 14. Nmap and Zenmap TCP Sequence Series Comparison 

Narrowing the comparison to the first 100 packets of each scan, the basic patterns 
are still nearly identical. This can be seen in Figure 15. After further analysis comparing 
the TCP sequence low-high alternation differences, a possible difference was discovered 
in Zenmap. On occasion, both Nmap and Zenmap alternated TCP sequence numbers by 
a difference of 65,537, in addition to the more frequently occurring value of 65,535. 
When the alternating difference was 65,537, the jump for Zenmap’s second packet to port 80 was still 
exactly 256 times the base difference value of 65,537. Zenmap’s third packet to port 
80 broke the trend by not showing the expected factor of exactly 512. Instead, the third packet to 
port 80 had a TCP sequence number jump that was slightly less than 512 times the base 
difference value. The tests conducted for this research showed that the selection of 
65,537 might occur more frequently with Zenmap than Nmap. Further research is needed 
to confirm this finding. 


Nmap & Zenmap Initial TCP Sequence Comparison (series: Nmap (root) TCP Sequence Series, Zenmap TCP Sequence Series; horizontal axis: First 100 Packets) 

Figure 15. Nmap and Zenmap First 100 Packets TCP Sequence Comparison 


As shown in Figure 15, Zenmap can be difficult to distinguish from Nmap. The 
simplest differentiation between Zenmap and Nmap comes from the attacker’s host kernel. 
The “RST” packet sent by the attacker’s host system kernel has that kernel’s default TTL 
value. However, another implementation of network defense may have been employed in 
Windows SP3, since it notably violated the original TCP RFC and did not send the 
“RST” packets. Consequently, while Zenmap conducted its scan, there were no “RST” 
packets sent back to open ports. The data may also be filtered to search for packets from 
the same source IP address with a TTL value of 128, indicating the presence of Windows. 
Nmap’s native Linux sends back “RST” packets with a TTL value of 64. This is a simple 
method to differentiate between Zenmap and Nmap. Table 2 summarizes the fingerprint 
characteristics of Zenmap. 


Zenmap v4.90RC1 Fingerprint Summary 

Internet Protocol version 4 
  Packet Length            60 bytes 
  Identification           Randomized over 99.9% of possible values 
  Flags                    Not Set (0x00) 
  TTL                      Randomized between 37-59 
  Source Address           10.0.0.3 (attacker) 
  Destination Address      10.0.0.4 (target) 

Transmission Control Protocol 
  Source Port              63373-63374 Alternating Sequence 
  Destination Port         min=1 to max=65389 
  Sequence Number          Alternating Sequence, difference=65537 (anomaly) 
  Acknowledgement Number   0, correct for SYN packet 
  Control Flags            SYN 
  Window                   Random selection of 4 fixed values: 1024, 2048, 3072, 4096 
  TCP Options              MSS=1460 bytes, 2 bytes of padding 

Comments 
  # Source Port changes at same time in sequence with TCP Sequence Number shifts 
  # Scans Port 80 (http) 3 times, all others only once 
  # On 2nd and 3rd packet to Port 80, TCP Sequence Number step jumps to a 
    higher value (256x and 511.984x the base difference of 65537), then returns to 
    normal sequence 
  # Window size is not a multiple of MSS 
  # TCP Options only MSS, not as expected from Windows 
  # RST Packet sent with TTL=128 

Table 2. Zenmap Fingerprint Summary 


C. UNICORNSCAN 

Unicornscan is a fast port scanner developed by Jack Louis [30]. We had 
difficulties getting the software to compile properly from source code, so the tests were 
run from a BackTrack 4 Pre-release version live DVD. BackTrack 4 also uses an Ubuntu 
Linux kernel, so there should be no behavioral differences compared to a normal Ubuntu 
installation [31]. The test was run as root with the command “unicornscan 
10.0.0.4.” The RUMINT illustration of the default Unicornscan scan is Figure 16. Note 
the following: multiple packet lengths, Don’t Fragment Flag set, constant TTL value of 
64, a large range of source ports from the registered and dynamic ranges (>1023), high 
concentration of well-known destination ports, with another cluster high in the dynamic 
range, and a broad variation in the TCP sequence numbers. 



Figure 16. Unicornscan default scan in RUMINT 

The two different packet lengths are caused by Unicornscan’s unique large TCP 
Options field, which makes each scan packet 78 bytes. By comparison, the 
negative response “RST” packets sent by the attacker are only 60 bytes. Unicornscan 
also uses a broader dynamic range of TCP source ports than the other tools with 
randomized TCP source ports. 

Figure 17 shows Unicornscan’s unique TCP Sequence Number trend. Note the 
random oscillation about the center range of TCP values with periodic spikes to very low 
values. 
Figure 17. Unicornscan TCP Sequence Number Trend (First 100 Packets) 

Unicornscan’s static and dynamic fingerprint parameters are listed in Table 3. 
Unicornscan v0.4.7 Fingerprint Summary 

Internet Protocol version 4 
  Packet Length            78 bytes 
  Identification           Randomized over 99.9% of possible values 
  Flags                    Don't Fragment (0x04) 
  TTL                      Constant 64 
  Source Address           10.0.0.3 (attacker) 
  Destination Address      10.0.0.4 (target) 

Transmission Control Protocol 
  Source Port              Randomized min=4108 to max=64925 
  Destination Port         Randomized min=7 to max=65535 
  Sequence Number          Randomized, oscillates about middle range with periodic low spikes 
  Acknowledgement Number   0, correct for SYN packet 
  Control Flags            SYN 
  Window                   Constant 16384 
  TCP Options              MSS=1436, NOP, NOP, SACK Permitted, NOP, Window Scale=0, NOP, NOP, 
                           Timestamps 

Comments 
  # Unique TCP Options sequence, differs from Linux and Windows standards 
    and all other tools tested 
  # Window size is not an even multiple of MSS 
  # Source Ports randomized, from high in registered range to maximum 

Table 3. Unicornscan Fingerprint Summary 


D. SCANRAND 

Scanrand is part of the Paketto Keiretsu suite of tools developed by Dan 
Kaminsky [32]. By default, Scanrand requires a specified port range in its vertical scan, 
sequentially incrementing the destination port. Scanrand also requires a scan speed 
specification as a precaution to not crash the network due to the large volume of SYN 
scan packets. The help file example was followed as a default scan setup. The command 
was issued as root as follows: “sudo scanrand -b10M 10.0.0.4:all.” 
RUMINT cannot play back more than 30,000 packets, so the scan was broken into 
segments for analysis. The trend is evident within the first 10,000 packets and is 
illustrated in RUMINT in Figure 18. Note the following: constant packet length, Don’t 
Fragment Flag set, TTL=255 for the scan packets, single source port, sequentially 
incrementing destination ports (more easily observed by watching the RUMINT playback 
progress, although the result for the first 10,000 packets can be seen), and finally a very large 
variation in TCP sequence numbers. 
Figure 18. Scanrand first 10,000 packets in RUMINT 


As with other tools that do not use the OS default TTL value, the RST packets 
conspicuously stand out from the scan packets with their TTL value of 64. 

One interesting fingerprint parameter of Scanrand not shown in Figure 18 is its 
constant IP identification number. None of the other tools had this characteristic, so the 
IP identification field was considered inconsequential to the visual fingerprints. A 
constant or insufficiently random IP identification number would be a serious weakness 
in a normal OS TCP/IP stack, allowing trivial MITM attacks. Since Scanrand’s goal is a 
reconnaissance scan and not normal TCP/IP communications, the constant IP 
identification field does not adversely affect the performance of the scan. 

While Scanrand’s IP identification number is uniquely constant, its TCP sequence 
number also has a unique characteristic. Scanrand’s TCP sequence number is the most 
varied of any of the tools tested. As can be seen in Figure 19, the TCP sequence numbers 
span nearly the entire range of possible values (99.94% of 0 to 4,294,967,295). 
Figure 19. Scanrand TCP Sequence Number Trend (First 100 Packets) 
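
The share of the 32-bit sequence space that a tool uses can be estimated as the span between the smallest and largest observed values. This is only a sketch of one plausible calculation; the text does not state how the percentages were computed, and the function name is hypothetical.

```python
SEQ_MAX = 2**32 - 1  # largest 32-bit sequence number, 4,294,967,295


def sequence_space_coverage(seqs):
    """Fraction of the 32-bit range spanned between min and max values."""
    if not seqs:
        return 0.0
    return (max(seqs) - min(seqs)) / SEQ_MAX


# Nmap's two alternating values from the earlier analysis, ignoring the
# jumps on the repeated packets to port 80.
nmap_coverage = sequence_space_coverage([1885629303, 1885694838])
print(f"{nmap_coverage:.6%}")
```

Under this measure, Nmap's alternating pair covers a tiny fraction of the space, while a scan whose values nearly reach both ends of the range, as reported for Scanrand, approaches 100%.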

Scanrand was also notable for its lack of TCP Options. Consequently, Scanrand’s 
scan packets end with 8 bytes of 0x00 padding. Table 4 summarizes the fingerprint 
characteristics of Scanrand. 
Scanrand v1.10 Fingerprint Summary 

Internet Protocol version 4 
  Packet Length            60 bytes 
  Identification           Constant=255 (0x00ff) 
  Flags                    Don’t Fragment (0x04) 
  TTL                      Constant 255 
  Source Address           10.0.0.3 (attacker) 
  Destination Address      10.0.0.4 (target) 

Transmission Control Protocol 
  Source Port              Constant 19990 
  Destination Port         Sequential, increments by 1 
  Sequence Number          Randomized, largest variation of tools tested 
  Acknowledgement Number   0, correct for SYN packet 
  Control Flags            SYN 
  Window                   Constant 4096 
  TCP Options              None, 8 bytes of padding (0x00) 

Comments 
  # Unique, IP Identification number is constant 
  # Unique, no TCP Options 
  # TCP Sequence Number has largest variation of tools tested 
  # Sequentially increasing Destination Port Numbers 

Table 4. Scanrand Fingerprint Summary 


E. STROBE 

Strobe was developed by Julian Assange as a “super-optimized,” fast TCP port 
scanner [33]. The Strobe scan was initiated with the command “sudo strobe 
10.0.0.4.” The RUMINT illustration of the default Strobe scan is shown in Figure 20. 
Note the following: multiple packet lengths, Don’t Fragment Flag set, a constant TTL 
value of 64, source port selected within a range, destination port sequential increments, and a 
dispersed but clustered range of TCP sequence numbers. 
RUMINT Parallel Coordinate Plot: Strobe Selected Packet Header Fields 

Figure 20. Strobe default scan in RUMINT 

Strobe’s TCP source ports were randomized in a range from a minimum of 32774 
to a maximum of 60983. The multiple packet lengths highlight the difference between 
the SYN scan packets and the RST packets sent by the host OS. The SYN scan packets 
are 74 bytes long, while the RST packets are their normal 60 bytes. 

Strobe has a unique TCP sequence number trend, as shown in Figure 21. It 
randomly oscillates in a small band but makes several large step jumps in value 
throughout the course of the scan. 
Strobe TCP Sequence Number Trend (horizontal axis: Packet Number; series: TCP Sequence Numbers) 

Figure 21. Strobe TCP Sequence Trend (Full Scan) 

Figure 22 shows a close-up view of the random oscillations by inspecting the 
region that appears linear in Figure 21, between packets 1,500 and 2,000. 
Strobe demonstrated another interesting characteristic, repeated sequential runs of 
destination ports. It scanned ports 1-66, then sent another ARP request for its target, then 
repeated the scan of those ports again. The sequence is shown in the destination port 
graph in Figure 23. 


Figure 23. Strobe TCP Destination Port Trend 

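
The repeated destination port runs can also be located programmatically. The sketch below is illustrative only: it reports where a sequential scan drops back to a lower port and which ports were probed more than once. The function names and the synthetic port list are assumptions; in a real capture, SYN retransmissions would also appear as repeated ports and would need to be separated out.

```python
from collections import Counter


def find_restarts(dst_ports):
    """Return (packet_index, previous_port, new_port) wherever the scan drops
    back to a lower destination port instead of continuing upward."""
    return [
        (i, dst_ports[i - 1], dst_ports[i])
        for i in range(1, len(dst_ports))
        if dst_ports[i] < dst_ports[i - 1]
    ]


def repeated_ports(dst_ports):
    """Return the sorted ports that were probed more than once."""
    counts = Counter(dst_ports)
    return sorted(port for port, n in counts.items() if n > 1)


# Synthetic port list modeled on the behavior described above: ports 1-66,
# then a restart that rescans 1-66 and (assumed here) continues to port 99.
ports = list(range(1, 67)) + list(range(1, 100))
restarts = find_restarts(ports)
rescanned = repeated_ports(ports)
print(restarts)                     # [(66, 66, 1)]
print(rescanned[0], rescanned[-1])  # 1 66
```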
Table 5 summarizes the fingerprint characteristics of Strobe. 

Strobe v1.06 Fingerprint Summary 

Internet Protocol version 4 
  Packet Length:           74 bytes 
  Identification:          Randomized 
  Flags:                   Don't Fragment (0x04) 
  TTL:                     Constant 64 
  Source Address:          10.0.0.3 (attacker) 
  Destination Address:     10.0.0.4 (target) 

Transmission Control Protocol 
  Source Port:             Randomized, min=32774 to max=60983 
  Destination Port:        Sequential, increments by 1, several repetitive runs 
  Sequence Number:         Randomized in small bands with large step deviations 
  Acknowledgement Number:  0, correct for SYN packet 
  Control Flags:           SYN 
  Window:                  Constant 5840 
  TCP Options:             MSS=1460, SACK Permitted, Timestamps, NOP, Window scale=6 

Comments 
  # Unique Destination Port Sequencing 
  # Window size is an even multiple of MSS (4 x 1460 = 5840) 
  # Unique TCP Sequence with small bands and large step deviations 


Table 5. Strobe Fingerprint Summary 


F. ANGRY IP SCANNER 

Anton Keks developed the Angry IP Scanner [34]. The test scans were performed 
through Angry IP Scanner’s Graphical User Interface (GUI). It did not specify a default 
list of ports to be scanned, so a complete vertical scan from 1 to 65535 was run. Again, 
the capture file was parsed into 10,000-packet increments for analysis. The RUMINT 
visualization of the first 10,000 packets of the scan is shown in Figure 24. Note the 
following: the scan initiates with a UDP packet, multiple packet lengths, the Don’t Fragment 
flag set, a constant TTL of 64, a randomized source port range, sequential destination port 
increments, and an increasing TCP sequence number trend. 

Figure 24. Angry IP Scanner first 10,000 packets in RUMINT (parallel coordinate plot; axes: Packet Length, IP Flag, TTL, Src IP, TCP Src Port, TCP Dst Port, Dst IP, TCP Seq Number) 

Similar to the previous cases, Angry IP Scanner’s multiple packet lengths are 
caused by the SYN scan packets being 74 bytes long and the RST packets being 60 bytes 
long. It is unique in initiating the scan with a UDP packet to port 33381. 

Angry IP Scanner’s TCP sequence number trend is shown in Figure 25. Note the 
random oscillations within a small range, while the overall trend increases 
linearly. 

Figure 25. Angry IP Scanner TCP Sequence Number Trend (First 3000 Packets) 
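
A trend of this kind (a narrow random band riding on a steady increase) can be summarized with two numbers: the slope of a least-squares line through the sequence numbers and the width of the band left over after removing that line. The sketch below is one possible way to compute them and is not the spreadsheet procedure used in this thesis. The function names and the synthetic data are assumptions, and 32-bit wraparound of the sequence number is ignored.

```python
def fit_line(values):
    """Least-squares slope and intercept of values against packet index."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two packets to fit a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def band_width(values):
    """Spread of the values around the fitted line (max minus min residual)."""
    slope, intercept = fit_line(values)
    residuals = [y - (slope * x + intercept) for x, y in enumerate(values)]
    return max(residuals) - min(residuals)


# Synthetic stand-in data: a steady increase of 5000 per packet plus a
# small oscillation of at most 100 (not values from the capture).
seqs = [5000 * i + (i * 37) % 101 for i in range(200)]
slope, _ = fit_line(seqs)
width = band_width(seqs)
print(round(slope), width < 300)
```

A large positive slope with a comparatively small band width would be consistent with the pattern in Figure 25, whereas a near-zero slope with a wide band would point toward a fully randomized sequence number.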


A summary table of the fingerprint characteristics of the Angry IP Scanner is 
shown in Table 6. 

Angry IP Scanner v3.0beta4 Fingerprint Summary 

Internet Protocol version 4 
  Packet Length:           74 bytes 
  Identification:          Randomized 
  Flags:                   Don't Fragment (0x04) 
  TTL:                     Constant 64 
  Source Address:          10.0.0.3 (attacker) 
  Destination Address:     10.0.0.4 (target) 

Transmission Control Protocol 
  Source Port:             Randomized, min=32784 to max=60991 
  Destination Port:        Sequential, increments by 1 
  Sequence Number:         Randomized in small band, monotonically increasing 
  Acknowledgement Number:  0, correct for SYN packet 
  Control Flags:           SYN 
  Window:                  Constant 1460 
  TCP Options:             MSS=1460, SACK Permitted, Timestamps, NOP, Window scale=6 

Comments 
  # Scan starts with UDP packet to port 33381, then TCP SYN packet to port 80, 
    then scan in sequential order 
  # Window size equals MSS 
  # TCP Sequence Number Trend randomized in band, overall apparent linear 
    monotonically increasing 


Table 6. Angry IP Scanner Fingerprint Summary 


G. SUMMARY 

The parsing and analysis process is critical to uniquely identifying the network 
reconnaissance tools. The methodology described above consists of three methods: 
visual fingerprinting, TCP sequence number trend analysis, and tabulation of fingerprint 
characteristics. 

The visualizations in RUMINT provide the fastest method to gain situational 
awareness regarding the reconnaissance tool’s overall behavior. However, in the 
presence of background network traffic or multiple scans, the visualization can become 
occluded, necessitating further data parsing before the individual scan can be identified. 
The three methods developed here are complementary, allowing an analyst to make an 
accurate identification of the reconnaissance scan tool. 

One of the most distinctive fingerprint characteristics of the reconnaissance tools is 
the TCP sequence number trend. However, it is also the most time-consuming analysis to 
perform. Figure 26 compares all of the TCP sequence number trends. 

Figure 26. TCP Sequence Number Trend Comparison of All Tools Tested (series: Nmap (root), Nmap (user), Zenmap, Unicornscan, Scanrand, Strobe, Angry IP Scanner; horizontal axis: Packet Number, first 350) 



Despite the large amount of data shown in Figure 26, some of the individual 
fingerprints can still clearly be identified. Scanrand’s broad range would nearly 
completely occlude all other data if it were not drawn with such a small line size. 
Unicornscan’s characteristic oscillations about the center value with periodic low-value 
spikes are also readily apparent. It is more difficult to discern the increasing trend of 
Angry IP Scanner and Nmap with user privileges. Nmap and Zenmap’s distinct 
alternating sawtooth pattern is not visible with this scale, but it is easier to see how 
different that pattern is compared to any of the other tools. 

Table 7 is a fingerprint characteristic summary table to compare all of the tools 
we tested. 

Reconnaissance Tool SYN Scan Summary Table 

Tool              Version  Length   DF    TTL    TTL       Source Port  Destination Port 
                           (bytes)  Flag         Pattern   Pattern      Pattern 
Nmap (root)       4.90     60       N     37-59  random    alternating  random 
Nmap (user)       4.90     74       Y     64     constant  random       random 
Zenmap            4.90     60       N     37-59  random    alternating  random 
Unicornscan       0.4.7    78       Y     64     constant  random       random 
Scanrand          1.10     60       Y     255    constant  constant     sequential 
Strobe            1.06     74       Y     64     constant  random       sequential, several repeat runs 
Angry IP Scanner  3.0 b4   74       Y     64     constant  random       sequential 

Tool              TCP Sequence Number Pattern                         Window                  MSS   TCP Options 
Nmap (root)       sawtooth                                            1024, 2048, 3072, 4096  1460  MSS 
Nmap (user)       random band, increasing                             5840                    1460  MSS, SACK Permitted, Timestamps, NOP, Window Scale 6 
Zenmap            sawtooth                                            1024, 2048, 3072, 4096  1460  MSS 
Unicornscan       random about central band with periodic low spikes  16384                   1436  MSS, NOP, NOP, SACK Permitted, NOP, Window Scale 0, NOP, Timestamp 
Scanrand          random                                              4096                    0     None, 8 bytes padding 
Strobe            random band with step jumps                         1460                    1460  MSS, SACK Permitted, Timestamps, NOP, Window Scale 6 
Angry IP Scanner  random band, linear increasing                      1460                    1460  MSS, SACK Permitted, Timestamps, NOP, Window Scale 6 


Table 7. Reconnaissance Tool SYN Scan Fingerprint Summary 
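
To illustrate how the static columns of Table 7 could be used for a first-pass lookup, the following sketch stores them in a dictionary and returns every tool whose static SYN-packet characteristics match an observed packet. This is a hedged illustration rather than a tool built for this thesis: the data structure, the field names, and the `hops` parameter (the number of routers assumed to lie between scanner and sensor) are assumptions, and the values are transcribed from Table 7 as printed.

```python
# Static SYN-packet characteristics transcribed from Table 7.
# TTL sets are the values seen with no router between scanner and sensor.
NMAP_SYN = {"length": 60, "df": False, "ttl": set(range(37, 60)),
            "window": {1024, 2048, 3072, 4096}, "mss": 1460}
LINUX_74 = {"length": 74, "df": True, "ttl": {64}, "window": {1460}, "mss": 1460}

FINGERPRINTS = {
    "Nmap (root)": NMAP_SYN,
    "Nmap (user)": {"length": 74, "df": True, "ttl": {64},
                    "window": {5840}, "mss": 1460},
    "Zenmap": NMAP_SYN,
    "Unicornscan": {"length": 78, "df": True, "ttl": {64},
                    "window": {16384}, "mss": 1436},
    # MSS of 0 mirrors the table entry for a packet with no TCP options.
    "Scanrand": {"length": 60, "df": True, "ttl": {255},
                 "window": {4096}, "mss": 0},
    "Strobe": LINUX_74,
    "Angry IP Scanner": LINUX_74,
}


def candidate_tools(packet, hops=0):
    """Return tools whose static fields match the packet.

    hops (an assumption) is added back to the observed TTL to undo the
    decrement applied by each router on the path.
    """
    return [
        tool for tool, fp in FINGERPRINTS.items()
        if packet["length"] == fp["length"]
        and packet["df"] == fp["df"]
        and packet["ttl"] + hops in fp["ttl"]
        and packet["window"] in fp["window"]
        and packet["mss"] == fp["mss"]
    ]


observed = {"length": 74, "df": True, "ttl": 64, "window": 1460, "mss": 1460}
print(candidate_tools(observed))  # ['Strobe', 'Angry IP Scanner']
```

The example returns two candidates because Strobe and Angry IP Scanner share the same static values in Table 7. Separating them requires the behavioral columns, such as the destination port pattern and the TCP sequence number trend.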


Figure 27 illustrates a flowchart of the reconnaissance fingerprint analysis 
methodology. 

Figure 27. Reconnaissance Fingerprint Analysis Flowchart (legend: Wireshark, RUMINT, Microsoft Excel, Decision; figure footnote: “# Can also perform all analysis”) 

V. CASE STUDY: NPS CYBER DEFENSE EXERCISE 


A. NPS CDX’08 NETWORK SETUP 

Each year since 2000, the National Security Agency’s Information Assurance 
Directorate has sponsored a Cyber Defense Exercise (CDX). The CDX is a competition 
among the five undergraduate service academies, with the Naval Postgraduate School 
(NPS) and the Air Force Institute of Technology (AFIT) also participating. Each school 
designs, builds, and then attempts to defend its network from attack by a U.S. 
Department of Defense (DoD) Red Team. The exercises provide hands-on experience in 
information assurance (IA) and real-world computer network defense (CND) for the 
students [35]. 


The NPS 2008 CDX network was set up as shown in Figure 28. 



Figure 28. NPS CDX 2008 Network Diagram 


B. CDX’08 DATA ANALYSIS 

The data analysis began by parsing the large packet capture file to a more 
manageable size with Wireshark. The resulting packet capture file included the early 
stages of the exercise, where the adversary performed initial reconnaissance. This was 
determined by observing the packet capture file in RUMINT. With an indication of a 
reconnaissance scan in the dataset, identified by the large number of TCP Destination 
Ports shown in the RUMINT playback, the packet capture file was filtered to only 
include packets that originated outside the NPS CDX network. 

From this dataset of packets originating outside the NPS CDX network, a large 
number of SYN packets were seen to originate from 10.2.168.245, in the adversary’s 
dark cloud on the upper right side of the network diagram shown in Figure 28. The 
dataset was again parsed down in Wireshark to only include packets from that source IP 
address. A RUMINT visualization of the dataset at this point is shown in Figure 29. 

Since most reconnaissance scans are performed using SYN packets, the dataset 
was filtered to show only SYN packets. This dataset still included numerous targets 
(destination IP addresses), so it was filtered to highlight just one target that 
received a large number of SYN packets, 10.1.90.5. This was the IP address of the NPS 
CDX network’s DNS server, a high-value target to the adversary. The NPS DNS server 
is shown in the lower left corner of the network diagram, Figure 28. 

Two similar scan targets of 10.2.168.245 were identified and examined. The data 
and fingerprint matched that of the DNS server target data. Consequently, the rest of this 
example follows the analysis of the scan that targeted the NPS DNS server. 

The dataset was filtered in Wireshark to display only the source IP 10.2.168.245 
and destination IP 10.1.90.5 and exported to a comma-separated values (“.csv”) file. 
This file was then imported into an Excel spreadsheet for further analysis. A RUMINT 
visualization of this dataset was also made, shown in Figure 30. 
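
The same source, destination, and SYN filtering can be reproduced on the exported file outside of a spreadsheet. The sketch below is an assumption-laden illustration, not the procedure used in the exercise: it assumes the export carries Wireshark's default summary columns (“Source”, “Destination”, “Info”) and that a bare SYN appears as “[SYN]” in the Info text. The sample rows are invented for the example, including the 10.1.90.7 address.

```python
import csv
import io


def filter_scan_rows(csv_file, src_ip, dst_ip):
    """Keep only SYN packets sent from src_ip to dst_ip.

    Column names follow Wireshark's default CSV export and are an assumption;
    "[SYN]" excludes SYN-ACK packets, whose Info text reads "[SYN, ACK]".
    """
    reader = csv.DictReader(csv_file)
    return [
        row for row in reader
        if row["Source"] == src_ip
        and row["Destination"] == dst_ip
        and "[SYN]" in row["Info"]
    ]


# Invented sample rows for illustration only (not packets from the CDX capture).
SAMPLE = '''"No.","Time","Source","Destination","Protocol","Length","Info"
"1","0.000","10.2.168.245","10.1.90.5","TCP","60","40001 > 22 [SYN] Seq=0 Win=1024 Len=0 MSS=1460"
"2","0.001","10.1.90.5","10.2.168.245","TCP","60","22 > 40001 [RST, ACK] Seq=1 Ack=1 Win=0 Len=0"
"3","0.002","10.2.168.245","10.1.90.7","TCP","60","40001 > 80 [SYN] Seq=0 Win=2048 Len=0 MSS=1460"
'''

rows = filter_scan_rows(io.StringIO(SAMPLE), "10.2.168.245", "10.1.90.5")
print([row["No."] for row in rows])  # ['1']
```

Within Wireshark itself, a display filter along the lines of `ip.src == 10.2.168.245 && ip.dst == 10.1.90.5 && tcp.flags.syn == 1 && tcp.flags.ack == 0` should select the same packets before export.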

At this point, the data was analyzed to determine possible reconnaissance tool 
fingerprint matches. 

C. IDENTIFYING LIKELY NMAP SCAN 

The RUMINT illustration in Figure 29 shows the network data originating from 
outside the NPS CDX network, from 10.2.168.245. The large variation of destination 
port numbers provided a crucial initial indicator of reconnaissance scan activity. 
However, there is still too much data present to make an accurate fingerprint 
identification by visual analysis alone. 



Figure 29. CDX Enemy Reconnaissance Scanning Activity in RUMINT (parallel coordinate plot of selected packet header fields) 


Next, the data was further filtered to display only the packets sent to a single 
target, the NPS CDX DNS server, 10.1.90.5. The RUMINT visualization is shown in 
Figure 30. Note the TTL value spread, small number of source ports, distribution of 
destination ports, and limited number of TCP sequence numbers. At this point, the 
analyst can begin to identify the reconnaissance tool used by visual fingerprint match. 

Figure 30. CDX Enemy Scan of DNS Server; Possible Nmap Scan (RUMINT parallel coordinate plot of selected packet header fields) 
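
The characteristics noted in Figure 30 (a spread of TTL values, few source ports, many destination ports, and few TCP sequence numbers) can also be confirmed by counting distinct values per header field. The sketch below shows one way to do so. The field names and the three sample packets are hypothetical and are not drawn from the CDX capture.

```python
def field_spread(packets, fields):
    """Count the distinct values seen for each header field."""
    return {field: len({pkt[field] for pkt in packets}) for field in fields}


# Hypothetical packets for illustration only.
packets = [
    {"ttl": 41, "src_port": 40001, "dst_port": 22, "seq": 100},
    {"ttl": 52, "src_port": 40002, "dst_port": 443, "seq": 65637},
    {"ttl": 37, "src_port": 40001, "dst_port": 8080, "seq": 100},
]
spread = field_spread(packets, ["ttl", "src_port", "dst_port", "seq"])
print(spread)  # {'ttl': 3, 'src_port': 2, 'dst_port': 3, 'seq': 2}
```

On a full scan, a destination port count in the thousands alongside only a handful of source ports and sequence numbers would be consistent with the visual impression described above.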


Based on the visual fingerprint in the previous chapter, it appears that the 
reconnaissance-scanning tool was Nmap. However, further specific fingerprint 
characteristics must be checked to validate that determination. The Nmap fingerprint 
summary table was used to check the CDX data, in addition to the most definitive Nmap 
identifier, the TCP sequence number trend. The TCP sequence number trend is shown in 
Figure 31. 

Figure 31. CDX 10.2.168.245 TCP Sequence (Possible Nmap Scan) 


The distinct alternating sawtooth pattern of Nmap is readily visible. Further 
numerical analysis showed the difference between the alternating numbers to be 65,537. 
The fingerprint match of the RUMINT visualization, TCP sequence number trend 
analysis, and summary data table (Table 8) provides a high-confidence result that the tool 
the adversary used was Nmap. 
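
The numerical check on the alternating pattern can be expressed compactly. The following is a simplified sketch, assuming the sequence numbers strictly alternate between two values on every packet, which a real capture may not do throughout. It tests whether consecutive differences flip sign and have a magnitude of 65,535 or 65,537, the two values listed for Nmap in this thesis. The function name and sample values are assumptions, and 32-bit wraparound is ignored.

```python
def is_alternating_sawtooth(seq_numbers, allowed_diffs=(65535, 65537)):
    """True when consecutive sequence numbers alternate up and down by one
    of the allowed differences."""
    deltas = [b - a for a, b in zip(seq_numbers, seq_numbers[1:])]
    if len(deltas) < 2:
        return False  # too few packets to call it a pattern
    sizes_ok = all(abs(d) in allowed_diffs for d in deltas)
    signs_flip = all((d1 > 0) != (d2 > 0) for d1, d2 in zip(deltas, deltas[1:]))
    return sizes_ok and signs_flip


# Hypothetical base value for illustration; only the 65,537 spacing comes
# from the analysis above.
base = 3_000_000_000
sawtooth = [base, base + 65537] * 5
print(is_alternating_sawtooth(sawtooth))             # True
print(is_alternating_sawtooth(list(range(0, 10))))   # False
```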

CDX Reconnaissance Scan Fingerprint Match 

Fingerprint Parameter                    CDX Possible Scan                         Nmap                                               Match? 
SYN Packet Length                        60                                        60                                                 Y 
IP Identification                        Random                                    Random                                             N/A 
IP Flags                                 Not Set                                   Not Set                                            Y 
Initial TTL                              Random 37-59*                             Random 37-59                                       Y 
IP Source Address                        10.2.168.245                              any                                                N/A 
IP Destination Address                   10.1.90.5                                 any                                                N/A 
TCP Source Port                          alternates between 2 numbers              alternates between 2 numbers                       Y 
TCP Destination Port                     random, observed 4-32787                  random, 1-65389                                    Y 
TCP Seq Number                           Sawtooth, alternates, difference = 65537  Sawtooth, alternates, difference = 65535 or 65537  Y 
TCP Ack Number                           0 for SYN                                 0 for SYN                                          N/A 
TCP Control Flags                        SYN                                       SYN                                                N/A 
TCP Window                               1024, 2048, 3072, 4096                    1024, 2048, 3072, 4096                             Y 
TCP Options                              MSS                                       MSS                                                Y 
Other: Src IP & TCP Seq change together  Y                                         Y                                                  Y 

*TTL observed 35-57, +1 known router, +1 likely adversary router 

Result: high confidence Nmap fingerprint 


Table 8. CDX Probable Nmap Scan Summary 


D. CONCLUSION 

This example demonstrates the analysis techniques developed in this thesis 
against a real-world cyber defense scenario. It is possible to uniquely identify the 
network reconnaissance tool used. 

The analysis process also revealed more information about the adversary. The 
network defender knows the adversary used Nmap to scan the network. Based on the 
spread of TTL values, RST packets, and the Nmap scan’s TTL frequency distribution, 
there were two routers between the adversary machine that launched the reconnaissance 
scans and the NPS DNS server. One router is known to be in the defender’s network; 
therefore, the adversary is most likely also situated behind a gateway router. The RST 
packets’ likely initial TTL of 64, Nmap’s most common installation as a Linux tool, and 
the TCP Options of other packets from 10.2.168.245 all suggest that the adversary’s 
scanning machine is running Linux. Thus, in the course of conducting active 
reconnaissance, the adversary has given away information to the defender that may be 
useful for adjusting defenses or for military and law enforcement to take appropriate 
action in response to the reconnaissance activity. 
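
The hop-count reasoning above rests on simple arithmetic: each router decrements the TTL by one, so a known initial TTL range that arrives uniformly shifted downward implies the number of routers crossed. The sketch below works through that arithmetic under stated assumptions. The function names are hypothetical, the list of common initial TTL values (64, 128, 255) reflects widely used operating system defaults, and the observed RST TTL of 62 is an illustrative value rather than a measurement reported in this thesis.

```python
COMMON_INITIAL_TTLS = (64, 128, 255)  # widely used OS defaults (assumed list)


def hops_from_range(observed_min, observed_max, tool_min, tool_max):
    """Hop count implied by a uniformly shifted TTL range, or None when the
    two ends of the range disagree."""
    shift_low = tool_min - observed_min
    shift_high = tool_max - observed_max
    if shift_low == shift_high and shift_low >= 0:
        return shift_low
    return None


def guess_initial_ttl(observed_ttl):
    """Return (likely initial TTL, implied hops) using the smallest common
    default that is not below the observed value."""
    for initial in COMMON_INITIAL_TTLS:
        if observed_ttl <= initial:
            return initial, initial - observed_ttl
    return None


# Nmap's randomized TTL range of 37-59 was observed as 35-57 in the CDX data.
print(hops_from_range(35, 57, 37, 59))  # 2
# Illustrative RST packet TTL (assumed value, not from the capture).
print(guess_initial_ttl(62))            # (64, 2)
```

Agreement between the two estimates, as in this example, would support the two-router interpretation. A disagreement would suggest that the scan packets and the RST packets took different paths or came from different hosts.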

VI. CONCLUSION 


A. SUMMARY 

This thesis examined the feasibility of passively fingerprinting several well-known 
network reconnaissance tools. The analysis methodology developed was shown 
to provide several different mechanisms to identify unique fingerprints. These techniques 
can provide valuable information to network defenders. Because the analysis is based on 
passive observation of network traffic, it does not alert the adversary. Information about 
the attacker can be collected without the attacker’s knowledge. This information can be 
used to tailor defensive responses to the attacker’s reconnaissance activities. 

The results of this research can be summarized in the following main points: 

• The visual fingerprints created in RUMINT confirmed the visual 
fingerprinting techniques first developed in [13]. RUMINT provided the 
most rapid indication and warning of the reconnaissance scans. This 
enabled the cuing of specific portions of a larger data set for further 
analysis. 

• The analysis condensed fingerprints into static and behavioral 
characteristics on summary tables. This enables the network defender to 
make an accurate and timely identification of the reconnaissance tools 
being employed by the attacker. 

• The analysis technique, derived from [9], introduced the application of 
TCP Sequence Number trend analysis to network reconnaissance tools. 
The TCP Sequence Number trend proved to be one of the most 
distinguishing aspects of each tool’s fingerprint. 

• Lastly, the thesis demonstrated that these techniques can be used to 
analyze real-world data from a Cyber Defense Exercise scenario to 
identify the reconnaissance tool an adversary used based on its unique 
fingerprint. 

B. RECOMMENDATIONS 


Parallel coordinate plots in visualization tools such as RUMINT are an effective 
method of visually displaying network data to rapidly gain situational awareness. As 
such, RUMINT is also a superb aid to provide the crucial early indication and warning of 
reconnaissance scanning activity. 

Deep analysis of network traffic data is a time-consuming process. Leaders must 
ensure plans take this time consideration into account and set realistic expectations. 

If possible, record and retain the traffic crossing the network’s external gateway 
rather than discarding it. It is possible to identify what tools an adversary uses to 
conduct network reconnaissance. Analysis of this information can produce early 
indications and warning that an adversary is targeting a network or maintaining a 
persistent intelligence collection effort. 

C. AREAS OF FUTURE RESEARCH 

1. More Detailed Fingerprinting 

This thesis demonstrated a number of methods to fingerprint one particular type 
of network reconnaissance scan: vertical TCP SYN scans. To build a more useful 
fingerprint database of reconnaissance tools, other tools and scan types must be examined 
and their fingerprints catalogued. The following areas warrant further research: 

• Other TCP vertical scan types, such as ACK scans and other exotic 
combinations of TCP Control Flags 

• ICMP, UDP, and IPv6 Scans 

• Horizontal scan identification and fingerprinting 

• Reconnaissance tool implementation on multiple different OSs 

• Network sweep scans 

• Distributed and idle scans, which attempt to hide the scan by masking the 
scan’s source 

• All of the above-mentioned reconnaissance across network distance, where 
the attacker and target are separated by several routers 

• Identifying and fingerprinting network reconnaissance in real-time 

• More effective network data visualization 

• Investigate pseudorandom number weaknesses, such as underlying 
patterns in the IP Identification field, more depth to the TCP Sequence 
Number trends, and specific identifiable randomization patterns in TCP 
Source or Destination Port selection 

• Explore the feasibility of creating behavior-based attacker profiles. 
Analyze the tools used, sequence, and specifics of attack methodology to 
correlate this behavior into an attacker profile database. 

2. Behavior-Based Attacker Profiles 

The fingerprints developed in this thesis are only part of a broader goal: creating a 
database of known attacker behavioral profiles. The fingerprinting analyses 
demonstrated in this thesis may be expanded to cover a multitude of known 
reconnaissance and attack tools. Armed with this knowledge, network defenders and 
forensic analysts may be able to specifically identify attackers based on their 
reconnaissance and attack behaviors [14]. Forensic analysts have developed 
sophisticated capabilities to reconstruct an attacker’s actions [36]. Given the degree of 
technical expertise required to become an expert attacker, it is likely that certain 
behavioral patterns emerge that could be used to identify the attacker. However, the 
attack specifics are often constrained by legal and other types of restrictions, impeding 
sharing of this type of information in the network defense community. 

If Google can aggregate behavioral information about individual users that may 
uniquely identify them, similar approaches may apply to gathering 
information about attackers [37]. The accumulated attacker profiles would provide 
network defenders with key insight into the attacker’s playbook. It could become apparent 
whether the attacker is a script-kiddie, criminal organization, terrorist group, or 
state-sponsored information warrior. This information could also enable law enforcement or 
military action, enhancing the ability of network defenders to respond appropriately. 

3. Addressing False Positives 

Each time network defenders create a new method or tool to better manage a 
network, the attackers adjust tactics to mitigate the defensive improvements. Matching 
network data to a fingerprint database to identify an adversary’s reconnaissance tools 
poses the risk of creating false positives. Too many false or unsubstantiated alarms could 
reduce or negate the effectiveness of creating automated reconnaissance detection tools. 
However, the process of analyzing the network data with these improved techniques can 
still help the defender by reducing the unknown. The more network behavior is known in 
a concise manner, the better able the network defender will be to investigate and diagnose 
the underlying problem. False positives are a continuous issue, but can be mitigated by 
understanding the way the network is supposed to operate and applying some of the 
analysis techniques outlined in this thesis. 

4. Fingerprint Aggregation and Correlation 

The ease of communications on the Internet also means that attackers can strike 
anytime and anywhere. Both attackers and defenders can leverage the Internet’s 
collaborative environment. Attackers can correlate information gained from attacks on a 
network’s periphery to improve a subsequent attack on the network’s core. Impediments 
to network defense information sharing hamper the defender’s collective ability to 
maintain SA. Openness exposes vulnerabilities and risk; however, it 
is also an effective method for collectively improving defensive capabilities. This model 
has proven very beneficial for many of the cryptographic algorithms that provide the 
basis of security on the Internet [7]. The open-source collaborative model delivers the 
most adaptable method for aggregating and correlating network defense information like 
reconnaissance fingerprints. More can be done to further this collaborative effort to make 
network defense more time responsive, adaptable, and effective. 

5. Enhancing the Spectrum of Cyberspace Operations 

The attacker-defender cyberspace operations model may be conceptually limiting. 
Considering attacks as a penetration of a network defensive line is analogous to trench 
warfare. A broader spectrum of cyberspace operations is possible. Armed with 
improved defensive information, such as reconnaissance and attack tool fingerprints, 
attackers’ behavioral profiles, and secure covert channel collaborative communications, a 
range of cyber-maneuver operations is possible. Cyberspace deception and 
counterintelligence operations could be employed to gain more information about 
adversaries and be used to potentially manipulate the adversary’s decision-making 
process. These cyber-maneuver options enable more potential courses of action. The 
enhanced SA would in turn allow more appropriate and tailored responses, from law 
enforcement action to precision counter-strike or deception. Further detail regarding 
these other concepts can be found in [38]-[42]. 